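Segmentation is a fundamental operation in image processing. Convolution operations suffer from a limited receptive field, whereas global modeling is fundamental to segmentation tasks. In this paper, we apply graph convolution to the segmentation task and propose an improved Laplacian. Unlike existing methods, our Laplacian is data-dependent, and we introduce two attention diagonal matrices to learn a better vertex relationship. In addition, it exploits both region and boundary information when performing graph-based information propagation. Specifically, we model and reason about boundary-aware region-wise correlations of different classes by learning graph representations, which makes it possible to carry out long-range semantic reasoning across regions with spatial enhancement along object boundaries. Our model is well suited to capturing global semantic region information while also accommodating local spatial boundary characteristics. Experiments on two challenging datasets show that our method outperforms state-of-the-art approaches on the segmentation of polyps in colonoscopy images and of the optic disc and optic cup in color fundus images.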
translated by 谷歌翻译
Correlation plays a critical role in the tracking field, especially in recent popular Siamese-based trackers. The correlation operation is a simple fusion method that considers the similarity between the template and the search re-
Digital human recommendation systems have been developed to help customers find their favorite products and play an active role in various recommendation contexts. Capturing and learning the dynamics of customer preferences in a timely manner, while meeting customers' exact requirements, is crucial in the digital human recommendation domain. We design a novel, practical digital human interactive recommendation agent framework based on reinforcement learning (RL) that improves the efficiency of interactive recommendation decision-making by leveraging both digital human features and the flexibility of RL. The proposed framework learns dynamically from real-time interactions between the digital human and customers using state-of-the-art RL algorithms, combined with multimodal embedding and graph embedding, to improve the accuracy of personalization and thus enable the digital human agent to capture the customer's attention in a timely manner. Experiments on real business data demonstrate that our framework provides better personalized customer engagement and a better customer experience.
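Multivariate long sequence time-series forecasting (M-LSTF) is a practical but challenging problem. Unlike traditional time-series forecasting tasks, M-LSTF tasks are more challenging in two respects: 1) M-LSTF models need to learn time-series patterns both within and across multiple time features; 2) under the rolling forecasting setting, the similarity between two consecutive training samples increases as the prediction length grows, which makes models more prone to overfitting. In this paper, we propose a generalizable memory-driven Transformer to target the M-LSTF problem. Specifically, we first propose a global-level memory component that drives the forecasting procedure by integrating multiple time-series features. In addition, we train our model in a progressive fashion to improve its generalizability, gradually introducing Bernoulli noise into the training samples. Extensive experiments have been performed on five different datasets across multiple fields. The experimental results show that our approach can be seamlessly plugged into different Transformer-based models to improve their performance by up to roughly 30%. To the best of our knowledge, this is the first work to focus specifically on the M-LSTF task.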
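The abstract does not say how the Bernoulli noise is applied or scheduled. As one possible reading only, the sketch below zeroes each input value independently with a probability that ramps up linearly over the epochs. The function names, the linear schedule, and the `max_prob` default are all assumptions made for illustration, not the authors' procedure.

```python
import random


def noise_probability(epoch, total_epochs, max_prob=0.2):
    """Assumed schedule: ramp the corruption probability linearly from 0 to max_prob."""
    if total_epochs <= 1:
        return max_prob
    return max_prob * epoch / (total_epochs - 1)


def apply_bernoulli_noise(sample, drop_prob, rng):
    """Zero each value independently with probability drop_prob (a Bernoulli mask)."""
    return [0.0 if rng.random() < drop_prob else x for x in sample]


# Hypothetical usage: early epochs see clean samples, later epochs see noisier ones.
rng = random.Random(0)
window = [0.5, 1.2, 0.9, 1.4, 1.1]
for epoch in range(3):
    p = noise_probability(epoch, total_epochs=3)
    noisy_window = apply_bernoulli_noise(window, p, rng)
```

Starting from clean samples and adding corruption gradually is what makes the training "progressive" in this reading; other schedules or noise forms would fit the abstract equally well.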
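When designing clustering algorithms, the choice of initial centers is crucial for the quality of the learned clusters. In this paper, we develop a new initialization scheme, called HST initialization, for the $k$-median problem (e.g., in discrete spaces induced by graphs), based on a tree structure constructed from the data. From the tree, we propose a novel and efficient search algorithm for good initial centers, which can subsequently be used by a local search algorithm. Our proposed HST initialization produces initial centers that are competitive with those from another popular initialization method, $k$-median++, at comparable efficiency. HST initialization can also be extended to the differential privacy (DP) setting to generate private initial centers. We show that, after applying DP local search, our private HST initialization improves on previous results on the approximation error and approaches the lower bound within a small factor. Experiments corroborate the theory and demonstrate the effectiveness of the proposed method. Our approach can also be extended to the $k$-means problem.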
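For context on the baseline mentioned above, here is a minimal sketch of distance-proportional seeding in the style of $k$-median++, together with the $k$-median cost it tries to keep low. It is not the HST initialization, whose details the abstract does not give. The function names are hypothetical, and the sketch assumes the input contains at least `k` distinct points and that `dist` is a metric supplied by the caller.

```python
import random


def kmedian_cost(points, centers, dist):
    """k-median objective: sum of distances from each point to its nearest center."""
    return sum(min(dist(p, c) for c in centers) for p in points)


def kmedianpp_seed(points, k, dist, rng):
    """Pick k initial centers; each new center is sampled with probability
    proportional to its distance from the nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Already-chosen centers get weight 0, so they are not picked again.
        weights = [min(dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


# Hypothetical usage on points of the real line with absolute difference as the metric.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
metric = lambda a, b: abs(a - b)
seeds = kmedianpp_seed(points, 2, metric, random.Random(0))
initial_cost = kmedian_cost(points, seeds, metric)
```

Because `dist` is passed in, the same sketch works for graph-induced discrete spaces by supplying shortest-path distances.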
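Minwise hashing (MinHash) is a classical method for efficiently estimating the Jaccard similarity in massive binary (0/1) data. To generate $K$ hash values for each data vector, the standard theory of MinHash requires $K$ independent permutations. Interestingly, the recent work on "circulant MinHash" (C-MinHash) has shown that only two permutations are needed. The first permutation breaks the structure of the data, and the second permutation is re-used $K$ times in a circulant manner. Surprisingly, the estimation variance of C-MinHash is proved to be strictly smaller than that of the original MinHash. More recent work further demonstrates that, in practice, only one permutation is needed. Note that C-MinHash is different from the well-known work on "one permutation hashing (OPH)" published in NIPS'12. OPH and its variants, which use different "densification" schemes, are popular alternatives to standard MinHash. The densification step is necessary to deal with the empty bins that arise in one permutation hashing. In this paper, we propose to incorporate the essential ideas of C-MinHash to improve the accuracy of one permutation hashing. Basically, we develop a new densification method for OPH that achieves the smallest estimation variance among all existing densification schemes for OPH. Our proposed method is named C-OPH (circulant OPH). After the initial permutation (which breaks the existing structure of the data), C-OPH only needs a "shorter" permutation of length $D/K$ (instead of $D$), where $D$ is the original data dimension and $K$ is the total number of bins in OPH. This short permutation is re-used in $K$ circulant shifts. It can be shown that the estimation variance of the Jaccard similarity is strictly smaller than that of the existing (densified) OPH methods.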
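The two-permutation C-MinHash idea described above can be sketched as follows: an initial permutation `sigma` scrambles the data, and a second permutation `pi` is re-used $K$ times by shifting it circularly. This is an illustrative sketch only, assuming $K \le D$ and binary vectors represented by the indices of their nonzero entries. The function names are hypothetical, and the densification step of C-OPH is not covered.

```python
import random


def c_minhash_signature(nonzeros, sigma, pi, num_hashes):
    """nonzeros: indices of the 1-entries; sigma, pi: permutations of range(D)."""
    dim = len(pi)
    permuted = [sigma[i] for i in nonzeros]  # initial permutation breaks data structure
    return [
        min(pi[(i - k) % dim] for i in permuted)  # pi shifted circularly by k positions
        for k in range(1, num_hashes + 1)
    ]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of hash positions on which the two signatures collide."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Hypothetical usage: two overlapping index sets in a D-dimensional binary space.
rng = random.Random(7)
D, K = 1000, 256
sigma = list(range(D))
rng.shuffle(sigma)
pi = list(range(D))
rng.shuffle(pi)
set_a = set(range(0, 300))
set_b = set(range(100, 400))  # true Jaccard similarity is 200 / 400 = 0.5
estimate = estimate_jaccard(
    c_minhash_signature(set_a, sigma, pi, K),
    c_minhash_signature(set_b, sigma, pi, K),
)
```

Only two permutations of length $D$ are stored, compared with $K$ independent permutations in standard MinHash.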
Conventional video compression approaches use a predictive coding architecture and encode the corresponding motion information and residual information. In this paper, taking advantage of both the classical architecture of conventional video compression and the powerful nonlinear representation ability of neural networks, we propose the first end-to-end deep video compression model that jointly optimizes all the components for video compression. Specifically, learning-based optical flow estimation is used to obtain the motion information and reconstruct the current frames. We then employ two auto-encoder style neural networks to compress the corresponding motion and residual information. All the modules are jointly learned through a single loss function, in which they collaborate with each other by considering the trade-off between reducing the number of compression bits and improving the quality of the decoded video. Experimental results show that the proposed approach can outperform the widely used video coding standard H.264 in terms of PSNR and is even on par with the latest standard H.265 in terms of MS-SSIM. Code is released at https://github.com/GuoLusjtu/DVC. [Figure: (a) Original frame (Bpp/MS-SSIM); (b) H.264 (0.0540 Bpp/0.945); (c) H.265 (0.082 Bpp/0.960); (d) Ours (0.0529 Bpp/0.961)]
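The trade-off between bit rate and decoded quality mentioned above is commonly written as a weighted sum of distortion and rate. The sketch below shows that generic form alongside the standard PSNR formula. The weight `lam`, the function names, and the choice of mean squared error as the distortion are assumptions for illustration, since the abstract does not spell out the exact loss.

```python
import math


def psnr(mse, max_value=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def rate_distortion_loss(distortion, bits_per_pixel, lam):
    """Generic rate-distortion objective: lam * D + R (lam is an assumed weight)."""
    return lam * distortion + bits_per_pixel


# Hypothetical usage: a larger lam favors quality, a smaller lam favors fewer bits.
quality_db = psnr(mse=25.0)
loss = rate_distortion_loss(distortion=25.0, bits_per_pixel=0.05, lam=0.01)
```

Training every module against one such scalar is what lets the motion and residual networks trade bits against reconstruction quality jointly.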
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
Blind image quality assessment (BIQA) remains challenging because of the diversity of distortions and the variation in image content, which complicate distortion patterns across different scales and make the regression problem for BIQA harder. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies that help the regression model perform better. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT), to help the model learn complex distortion patterns and to better optimize the regression problem in line with the easy-to-hard progression of human learning. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets. The experimental results indicate that PMT-IQA outperforms the comparison approaches and that both the MS and PMT modules improve the model's performance.